Background Information and Description of the Core Business Problem

American consumers have always had a contemptuous relationship with their FI (Financial Institutions) over a variety of issues and those issues have evolved over time. But more than ever, as banking products/services and other debt instruments increase their reach into the lives of Americans in different facets there are a greater number of consumers who are generally dissatisfied with their FI’s and this trend is heading upwards.

A satisfied customer base is necessary for an FI to be successful and to retain and grow their footprint in an increasingly competitive industry but a host of complex issues makes it more difficult for FI’s to connect with their customers and keep them happy. As technology moves towards web3 and a DeFI environment it is imperative that FI’s do what is necessary to repair a history of distrust and contempt they have with their customers.

In this report, we will examine what issues FI’s face why the are facing them and what actionable steps can be taken to increase satisfaction rates amongst their customers. For the analysis, we will be using the available public data on consumer complaints to the CFPB (Consumer Financial Protection Bureau) found at https://files.consumerfinance.gov/ccdb/complaints.csv.zip


Data Exploration and Analysis

This is a fairly large dataset so we will analyze the data in R Studio using SQL with the sqldf package. All steps can be followed and recreated using the original RMD file with the original raw data files rather than this summarized HTML report. Therefore, for this report we will only be going over the most relevant steps to solve this business case.

Our exploration of the data begins with a high-level view of the data with all the fields in view. We see each record is a customer complaint about an FI or a credit reporting agency (Equifax, Transunion and Experian). For this case study, we are strictly analyzing only FI’s so we will split the dataset in two separated by if the complaint is about an FI or a credit reporting agency.

sqldf("SELECT * FROM all_complaints LIMIT 6")
##   Date.received
## 1    2019-05-03
## 2    2022-01-15
## 3    2019-10-05
## 4    2022-01-14
## 5    2022-01-14
## 6    2022-01-14
##                                                                        Product
## 1                                                                 Student loan
## 2 Credit reporting, credit repair services, or other personal consumer reports
## 3 Credit reporting, credit repair services, or other personal consumer reports
## 4                                                  Credit card or prepaid card
## 5                                                                     Mortgage
## 6 Credit reporting, credit repair services, or other personal consumer reports
##                      Sub.product
## 1           Private student loan
## 2               Credit reporting
## 3 Other personal consumer report
## 4              Store credit card
## 5     Conventional home mortgage
## 6               Credit reporting
##                                                     Issue
## 1                           Struggling to repay your loan
## 2                    Incorrect information on your report
## 3                    Incorrect information on your report
## 4 Advertising and marketing, including promotional offers
## 5                              Struggling to pay mortgage
## 6                    Incorrect information on your report
##                                                 Sub.issue
## 1 Can't get other flexible options for repaying your loan
## 2                           Account information incorrect
## 3                    Public record information inaccurate
## 4          Didn't receive advertised or promotional terms
## 5                                                        
## 6                           Account information incorrect
##   Consumer.complaint.narrative
## 1                             
## 2                             
## 3                             
## 4                             
## 5                             
## 6                             
##                                                                           Company.public.response
## 1                                                                                                
## 2                                                                                                
## 3                                                                                                
## 4                                                                                                
## 5                                                                                                
## 6 Company has responded to the consumer and the CFPB and chooses not to provide a public response
##                               Company State ZIP.code Tags
## 1                 Ameritech Financial    FL    33023     
## 2 Experian Information Solutions Inc.    NV    89146     
## 3                       EQUIFAX, INC.    CA    91401     
## 4                 SYNCHRONY FINANCIAL    IN    46052     
## 5                       LoanCare, LLC    LA    70056     
## 6       CREDIT ACCEPTANCE CORPORATION    IN    46235     
##   Consumer.consent.provided. Submitted.via Date.sent.to.company
## 1       Consent not provided           Web           2019-05-15
## 2                                      Web           2022-01-15
## 3       Consent not provided           Web           2019-10-05
## 4                                      Web           2022-01-14
## 5                                      Web           2022-01-14
## 6                                      Web           2022-01-14
##   Company.response.to.consumer Timely.response. Consumer.disputed. Complaint.ID
## 1            Untimely response               No                N/A      3231257
## 2                  In progress              Yes                N/A      5111916
## 3      Closed with explanation              Yes                N/A      3396987
## 4                  In progress              Yes                N/A      5107466
## 5                  In progress              Yes                N/A      5098455
## 6  Closed with monetary relief              Yes                N/A      5110031
sqldf("SELECT Company, COUNT(*) AS total_complaints FROM all_complaints GROUP BY Company ORDER BY total_complaints DESC LIMIT 10")
##                                   Company total_complaints
## 1                           EQUIFAX, INC.           382727
## 2  TRANSUNION INTERMEDIATE HOLDINGS, INC.           288037
## 3     Experian Information Solutions Inc.           266199
## 4   BANK OF AMERICA, NATIONAL ASSOCIATION           109078
## 5                   WELLS FARGO & COMPANY            92904
## 6                    JPMORGAN CHASE & CO.            86155
## 7                          CITIBANK, N.A.            72220
## 8       CAPITAL ONE FINANCIAL CORPORATION            62741
## 9                 Navient Solutions, LLC.            35683
## 10                    SYNCHRONY FINANCIAL            35456


Firstly, let’s explore the credit agency dataset and immediately we see that 62% of all complaints about the credit agencies is them having incorrect information (Issues of Incorrect information on your report and Incorrect information on credit report). This seems like a significant metric so we will record this finding and try to shine some light on it contextually in the rest of the analysis.

sqldf("SELECT Issue, count(*) AS Total FROM ca_complaints GROUP BY Issue ORDER BY Total DESC LIMIT 10")
##                                                                               Issue
## 1                                              Incorrect information on your report
## 2  Problem with a credit reporting company's investigation into an existing problem
## 3                                            Incorrect information on credit report
## 4                                                       Improper use of your report
## 5                                          Credit reporting company's investigation
## 6                                                 Attempts to collect debt not owed
## 7                                  Unable to get your credit report or credit score
## 8                                          Unable to get credit report/credit score
## 9                                     Problem with fraud alerts or security freezes
## 10                          Credit monitoring or identity theft protection services
##     Total
## 1  472486
## 2  218374
## 3   97279
## 4   63988
## 5   16344
## 6   14494
## 7   13291
## 8   10535
## 9    9930
## 10   5154


When we shift our attention to the FI complaints we see that Attempts to collect on debt not owed is the biggest complaint from customers. In conjunction with this query we will also query the FI Products that receive the most complaints.

sqldf("SELECT Issue, count(*) AS Total FROM fi_complaints GROUP BY Issue ORDER BY Total DESC LIMIT 10")
##                                                                               Issue
## 1                                                 Attempts to collect debt not owed
## 2                                          Loan modification,collection,foreclosure
## 3                                              Incorrect information on your report
## 4                                          Loan servicing, payments, escrow account
## 5                                                               Managing an account
## 6                                             Cont'd attempts collect debt not owed
## 7                                                   Written notification about debt
## 8                                                    Trouble during payment process
## 9                                                             Communication tactics
## 10 Problem with a credit reporting company's investigation into an existing problem
##     Total
## 1  112668
## 2  112292
## 3  104178
## 4   77310
## 5   69779
## 6   60574
## 7   57448
## 8   54727
## 9   46580
## 10  44584


Debt products as well as credit reporting are the biggest complaint getters. The margin by which credit reporting is the number one complaint does speak volume and possibly how this could overlap with complaints from the credit agencies.

sqldf("SELECT Product, COUNT(*) AS Total FROM fi_complaints GROUP BY Product ORDER by Total DESC LIMIT 10")
##                                                                         Product
## 1                                                               Debt collection
## 2                                                                      Mortgage
## 3  Credit reporting, credit repair services, or other personal consumer reports
## 4                                                   Credit card or prepaid card
## 5                                                   Checking or savings account
## 6                                                                   Credit card
## 7                                                       Bank account or service
## 8                                                                  Student loan
## 9                            Money transfer, virtual currency, or money service
## 10                                                                Consumer Loan
##     Total
## 1  385730
## 2  345358
## 3  162427
## 4  131938
## 5  110985
## 6   89074
## 7   86157
## 8   64464
## 9   36557
## 10  31477


The FI’s response to complaints would be of an interest to know because we could see if the FI’s are admitting fault for any of the issues customers are having. The number one response of “Company has responded to the consumer and the CFPB and chooses not to provide a public response “ is very ambiguous and upon clarifying with the CFPB this response is not an admission of culpability so we are unable to infer a conclusion from this. However, it does make sense that if an FI was at fault that this would be the response they would choose.

sqldf("SELECT `Company.public.response`, count(*) AS total FROM fi_complaints GROUP BY `Company.public.response` ORDER BY total DESC LIMIT 10")
##                                                                                                    Company.public.response
## 1                                                                                                                         
## 2                          Company has responded to the consumer and the CFPB and chooses not to provide a public response
## 3                                                 Company believes it acted appropriately as authorized by contract or law
## 4                                                                         Company chooses not to provide a public response
## 5                                                       Company believes the complaint is the result of a misunderstanding
## 6                                                                    Company disputes the facts presented in the complaint
## 7                                                            Company believes complaint is the result of an isolated error
## 8  Company believes complaint caused principally by actions of third party outside the control or direction of the company
## 9                           Company believes complaint represents an opportunity for improvement to better serve consumers
## 10                                                              Company can't verify or dispute the facts in the complaint
##      total
## 1  1013585
## 2   332753
## 3    99646
## 4    33970
## 5     9689
## 6     8097
## 7     5287
## 8     5254
## 9     3801
## 10    3508


Customer.complaint.narrative is a manual field that customers can choose to write an explanation of their complaints. Most FI customers overwhelmingly decide to leave this field blank but perhaps by looking for reappearing key words or phrases we can gain some insight.

sqldf("SELECT `Consumer.complaint.narrative`, COUNT(*) AS total FROM fi_complaints GROUP BY `Consumer.complaint.narrative` ORDER BY total DESC LIMIT 10")
##                          Consumer.complaint.narrative  total
## 1                                                     959635
## 2  I was shocked when I reviewed my credit report and   1098
## 3  I have filed a dispute in regards to the incorrect    819
## 4  I am a victim of identity theft. Due to the Corona    635
## 5  I am a victim of identity theft and this debt does    541
## 6  This particular account situation that is lately f    507
## 7  My name is XXXX XXXX, and I am sending this reques    501
## 8  I am a victim of identity theft. The following acc    486
## 9  RE : Attentively review my formal writ composed de    478
## 10 There are unknown hard inquiries on my credit repo    474


Related to Year, next we will query the top issues of the complaints made by issue from the years 2014-2019 and 2020-2022 and see if there has been any significant changes in the issues being complained about. We will create a new field calculated to the overall percent per time frame of complaints to give us a fair picture of any changes over time. We see that the rate of the issue Incorrect information on your report in 2020-2022 has tripled from 2014-2019.

sqldf("SELECT `Date.received` AS Year, COUNT(*) AS Total_issues FROM iby GROUP BY Year ORDER BY Year DESC")
##    Year Total_issues
## 1  2022        10855
## 2  2021       240435
## 3  2020       191644
## 4  2019       160546
## 5  2018       168951
## 6  2017       168462
## 7  2016       148554
## 8  2015       135470
## 9  2014       124793
## 10 2013        94607
## 11 2012        70561
## 12 2011         2536
sqldf("SELECT Issue, ROUND(CAST(COUNT(*) AS FLOAT) / (SELECT CAST(COUNT(*) AS FLOAT) FROM iby WHERE `Date.received` = 2014 OR `Date.received` = 2015 OR `Date.received` = 2016 OR `Date.received` = 2017 OR `Date.received` = 2018 OR `Date.received` = 2019),4) AS percent FROM iby WHERE `Date.received` = 2014 OR `Date.received` = 2015 OR `Date.received` = 2016 OR `Date.received` = 2017 OR `Date.received` = 2018 OR `Date.received` = 2019 GROUP BY Issue ORDER BY percent DESC LIMIT 10")
##                                       Issue percent
## 1  Loan modification,collection,foreclosure  0.0666
## 2     Cont'd attempts collect debt not owed  0.0619
## 3  Loan servicing, payments, escrow account  0.0596
## 4         Attempts to collect debt not owed  0.0578
## 5      Incorrect information on your report  0.0481
## 6                     Communication tactics  0.0400
## 7                       Managing an account  0.0386
## 8            Trouble during payment process  0.0337
## 9           Written notification about debt  0.0323
## 10          Disclosure verification of debt  0.0317
sqldf("SELECT Issue, ROUND(CAST(COUNT(*) AS FLOAT) / (SELECT CAST(COUNT(*) AS FLOAT) FROM iby WHERE `Date.received` = 2022 OR `Date.received` = 2021 OR `Date.received` = 2020),4) AS percent FROM iby WHERE `Date.received` = 2022 OR `Date.received` = 2021 OR `Date.received` = 2020 GROUP BY Issue ORDER BY percent DESC LIMIT 10")
##                                                                               Issue
## 1                                              Incorrect information on your report
## 2                                                 Attempts to collect debt not owed
## 3                                                               Managing an account
## 4  Problem with a credit reporting company's investigation into an existing problem
## 5                                                   Written notification about debt
## 6                                                    Trouble during payment process
## 7                                   Problem with a purchase shown on your statement
## 8                                                       Improper use of your report
## 9                                                        Struggling to pay mortgage
## 10                      Applying for a mortgage or refinancing an existing mortgage
##    percent
## 1   0.1368
## 2   0.1360
## 3   0.0784
## 4   0.0675
## 5   0.0637
## 6   0.0545
## 7   0.0390
## 8   0.0299
## 9   0.0248
## 10  0.0214


Deep Dive Insights and Conclusions

Our exploration and analysis has drawn out a number of interesting findings and conclusions that we will report on in this section. The conclusions will set the understand and groundwork for the actionable recommendations that are in the next section.

A primary conclusion that can be made is that most complaints to both FI’s or credit agencies are rooted in debt. Whether it’s a problem with a mortgage or credit card or of the credit agencies chasing people for debt not owed, its from issues like this that are the motivators of customers making complaints in the first place. Providing consumers access to debt capital is a primary business model of FI’s so lets explore what needs to be done more efficiently to keep customers satisfied and their complaints at a minimum.

Debt originates with the FI but often gets sent to collections and the credit agencies. We see from our analysis that complaints to the 3 main credit agencies in the US make up the majority of all complaints over FI’s by far.


The data also suggests that there is an overlap of the complaints made to the credit agencies and the FI’s. Remember that all customer debt issues with the credit agencies began as an issue with their FI. Therefore some overlap would be inevitable especially if an issue is not corrected in sufficient time for a customer, which causes them to log a complaint. Therefore, from the rate of complaints per capital plotted on a map of the US by FI’s and credit agencies we can see similar color patterns of the states signalling that FI and credit agency complaints are happening at the same magnitude throughout the United States.



We can see that there is a correlation between the complaints to a credit agency and those to an FI so when we examine deeper into the issues of the credit agencies and we see that the overwhelming reason for complaints according to customers is the credit agency having incorrect information. Even though the issue is with this credit agency, the customer would also understand that the incorrect information comes from their FI and would naturally also hold them accountable.


The last most interesting insight we found in our exploration and analysis was the year-over-year increase in complaints especially in 2020-2022. The Consumer.complaint.narrative is a sparsely populated field but we see a repeated phrase of “Identity Theft” as well as “Fraud” that continuously show up in this field. If we perform a query of percentage of complaints where these were issues from 2014-2019 and 2020-2021(present) we see a small but significant percentage increase.


sqldf("SELECT COUNT(*) AS total FROM iay WHERE `Consumer.complaint.narrative` LIKE '%identity theft%' OR `Consumer.complaint.narrative` LIKE '%Identity theft%' ORDER BY total DESC")
##   total
## 1 57594
sqldf("SELECT COUNT(*) / (SELECT CAST(COUNT(*) AS FLOAT) FROM iay WHERE `Date.received` = 2014 OR `Date.received` = 2015 OR `Date.received` = 2016 OR `Date.received` = 2017 OR `Date.received` = 2018 OR `Date.received` = 2019) AS percent FROM iay WHERE (`Date.received` = 2014 OR `Date.received` = 2015 OR `Date.received` = 2016 OR `Date.received` = 2017 OR `Date.received` = 2018 OR `Date.received` = 2019) AND (`Consumer.complaint.narrative` LIKE '%identity theft%' OR `Consumer.complaint.narrative` LIKE '%Identity theft%' OR `Consumer.complaint.narrative` LIKE '%fraud%')")
##     percent
## 1 0.0634834
sqldf("SELECT COUNT(*) / (SELECT CAST(COUNT(*) AS FLOAT) FROM iay WHERE `Date.received` = 2020 OR `Date.received` = 2021 OR `Date.received` = 2022) AS percent FROM iay WHERE (`Date.received` = 2020 OR `Date.received` = 2021 OR `Date.received` = 2022) AND (`Consumer.complaint.narrative` LIKE '%identity theft%' OR `Consumer.complaint.narrative` LIKE '%Identity theft%' OR `Consumer.complaint.narrative` LIKE '%fraud%')")
##      percent
## 1 0.08655167


Now with most of the values blank in the Consumer.complaint.narrative the slight 2.3% increase in reported identity theft does not give us a definitive correlation in increased cases of identity theft so lets take a look at this data from the FTC regarding reported cases of identity theft and compare it with our dataset on complaints from the CFRB. (FTC data link: https://www.iii.org/fact-statistic/facts-statistics-identity-theft-and-cybercrime).

In this visual we can see the reported cases of identity theft from the FTC with the number of complaints per year from 2014-2020 by the CFRB (2021 data from the FTC is unfortunately not yet available). We see a big increase in 2020 of identity theft and complaints to the CFRB. There also seems to be a correlation of these values when looking at all previous years.

Often when someone’s identity is stolen the fraudsters use the stolen identity to acquire debt in their victims names unbeknownst to them. Therefore, its possible many victims of identity theft may not even know they are victims and when issues come up it could just be registered as incorrect information on their credit report, which we remember is the single most common issue for credit agencies.

We do not have the sufficient data necessary to infer in one way or another what is the reason for the surge of identity theft in 2020, obviously the Covid-19 pandemic struck during this time but there is no information from the data to be certain why. However, this would be something interesting to investigate in a separate business case study.